Filaments: Efficient Support for Fine-Grain Parallelism
Authors
Abstract
It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and synchronization. On the other hand, there are several advantages to fine-grain parallelism: architecture independence, ease of programming, ease of use as a target for code generation, and load-balancing potential. This paper describes a portable threads package, Filaments, that supports efficient execution of fine-grain parallel programs on shared-memory multiprocessors. Filaments supports three kinds of threads, which appear to be sufficient for scientific computations: run-to-completion, barrier (iterative), and fork/join. Filaments employs a unique combination of techniques to achieve efficiency: stateless threads, very small thread descriptors, optimized barrier synchronization, scheduling that enhances data locality, and automatic pruning of fork/join threads. The gains in performance are such that on an application such as Jacobi iteration, the execution time for a fine-grain program with a worst-case granularity of a thread per point can be within 10% of that for a coarse-grain program with only one task per processor. Execution times for problems with more work per thread are usually indistinguishable from those of coarse-grain programs, and they can be faster when the amount of work per thread varies.
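To make the "thread per point" granularity concrete, the following is a minimal sketch of Jacobi iteration in which each interior grid point is a stateless, run-to-completion unit of work. It is not the Filaments API: the names `jacobi_point`, `make_filaments`, `run_iteration`, `jacobi_fine`, and `jacobi_coarse` are hypothetical, the descriptor layout (a function reference plus two indices) is an assumption, and the worker pools are executed sequentially here, whereas a real package would run each pool on its own processor.

```python
# Illustrative sketch only; all names are hypothetical, not the Filaments API.

def jacobi_point(old, new, i, j):
    """One fine-grain unit of work: update a single interior grid point."""
    new[i][j] = 0.25 * (old[i - 1][j] + old[i + 1][j]
                        + old[i][j - 1] + old[i][j + 1])


def make_filaments(n, workers):
    """Create one small descriptor per interior point of an n x n grid.

    A descriptor is just (function, (i, j)): no private stack and no saved
    registers, which is the sense in which the thread is "stateless" here.
    Contiguous rows go to the same worker so that neighbouring points are
    processed together (a simple stand-in for locality-aware scheduling).
    """
    pools = [[] for _ in range(workers)]
    interior = n - 2
    for i in range(1, n - 1):
        w = (i - 1) * workers // interior
        for j in range(1, n - 1):
            pools[w].append((jacobi_point, (i, j)))
    return pools


def run_iteration(pools, old, new):
    """Run every descriptor to completion, pool by pool.

    Assumption: in a real package each pool would be executed by one server
    thread, followed by a barrier. This sketch runs the pools one after the
    other, so the end of the loop plays the role of the barrier.
    """
    for pool in pools:
        for fn, (i, j) in pool:
            fn(old, new, i, j)


def jacobi_fine(grid, iterations, workers=2):
    """Fine-grain version: one descriptor per point, reused every iteration."""
    old = [row[:] for row in grid]
    new = [row[:] for row in grid]
    pools = make_filaments(len(grid), workers)
    for _ in range(iterations):
        run_iteration(pools, old, new)
        old, new = new, old  # swap grids after the barrier
    return old


def jacobi_coarse(grid, iterations):
    """Reference version: a plain loop nest over the whole grid."""
    n = len(grid)
    old = [row[:] for row in grid]
    new = [row[:] for row in grid]
    for _ in range(iterations):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                new[i][j] = 0.25 * (old[i - 1][j] + old[i + 1][j]
                                    + old[i][j - 1] + old[i][j + 1])
        old, new = new, old
    return old


if __name__ == "__main__":
    n = 5
    grid = [[1.0] * n] + [[0.0] * n for _ in range(n - 1)]  # hot top edge
    print(jacobi_fine(grid, 10) == jacobi_coarse(grid, 10))
```

Both versions perform the same arithmetic per point, so they should produce identical grids; the sketch only shows how the work is packaged, not the performance behaviour the abstract reports.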
Related resources
Filaments: Efficient Support for Fine-Grain Parallelism
It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and synchronization. On the other hand, there are several advantages to fine-grain parallelism: architecture independence, ease of programming, ease of use as a target for code generation, and load-balancing potential. Thi...
Efficient support for fine-grain parallelism on shared-memory machines
A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to fine-grain parallelism, conventional wisdom is that coarse-grain parallelism is more efficient. This paper illustrates the advantages of fine-grain parallelism and presents an efficient implementati...
Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively parall...
Efficient Support for Fine-Grain Parallelism on Shared-Memory Machines
A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to fine-grain parallelism, conventional wisdom is that coarse-grain parallelism is more efficient. This paper illustrates the advantages of fine-grain parallelism and presents an efficient implementation for sha...
Efficient Support for Fine-Grain Parallelism on Shared
A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to fine-grain parallelism, conventional wisdom is that coarse-grain parallelism is more efficient. This paper illustrates the advantages of fine-grain parallelism and presents an efficient implementation for sha...